Neural Network Compression Framework for Fast Model Inference

Authors

Abstract

We present a new PyTorch-based framework for neural network compression with fine-tuning named Neural Network Compression Framework (NNCF) (https://github.com/openvinotoolkit/nncf). It leverages recent advances of various network compression methods and implements some of them, namely quantization, sparsity, filter pruning and binarization. These methods allow producing more hardware-friendly models that can be efficiently run on general-purpose hardware computation units (CPU, GPU) or specialized deep learning accelerators. We show that the implemented methods and their combinations can be successfully applied to a wide range of architectures and tasks to accelerate inference while preserving the original model's accuracy. The framework can be used in conjunction with the supplied training samples or as a standalone package that can be seamlessly integrated into existing training code with minimal adaptations.
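As a rough illustration of two of the method families named in the abstract, the following sketch shows symmetric uniform quantization (as a quantize-then-dequantize step) and magnitude-based sparsity on a flat list of weights. It is not NNCF's API or its implementation: the function names `fake_quantize` and `magnitude_sparsify`, the per-tensor scale, and the 8-bit default are assumptions made for this example only.

```python
from typing import List


def fake_quantize(weights: List[float], num_bits: int = 8) -> List[float]:
    """Hypothetical helper: snap weights to a symmetric integer grid and map back to floats."""
    level_high = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)  # all-zero tensor: nothing to quantize
    scale = max_abs / level_high  # one scale for the whole tensor (assumption)
    quantized = []
    for w in weights:
        level = round(w / scale)  # nearest integer level
        level = max(-level_high, min(level_high, level))  # clamp to the grid
        quantized.append(level * scale)  # dequantize back to float
    return quantized


def magnitude_sparsify(weights: List[float], sparsity: float) -> List[float]:
    """Hypothetical helper: zero the given fraction of weights with the smallest magnitude."""
    num_zero = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:num_zero])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


weights = [0.5, -1.27, 0.013, 0.8]
quantized = fake_quantize(weights)          # values on a grid with step max|w| / 127
sparse = magnitude_sparsify(weights, 0.5)   # the two smallest-magnitude weights become 0.0
```

In a fine-tuning setting, steps like these would typically be applied inside the forward pass so that the model can adapt to the reduced precision or the removed weights; the sketch leaves out training entirely.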

Journal

Journal title: Lecture Notes in Networks and Systems

Year: 2021

ISSN: 2367-3370, 2367-3389

DOI: https://doi.org/10.1007/978-3-030-80129-8_17